---
title: Upgrading from Version 5.x
---

If you have installed, configured, and are using PXF 5.x in your Greenplum Database 5 or 6 cluster, you must perform some upgrade actions when you install PXF 6.x.

<div class="note"><b>Note:</b> If you are using PXF with Greenplum Database 5, you must upgrade Greenplum to version 5.21.2 or newer before you upgrade to PXF 6.x.</div>

The PXF upgrade procedure has three steps. You perform one pre-install procedure, the install itself, and then a post-install procedure to upgrade to PXF 6.x:

-   [Step 1: Perform the PXF Pre-Upgrade Actions](#pxfpre)
-   [Step 2: Install PXF 6.x](#pxfinst)
-   [Step 3: Complete the Upgrade to PXF 6.x](#pxfup)


## <a id="pxfpre"></a>Step 1: Performing the PXF Pre-Upgrade Actions

Perform this procedure before you upgrade to a new version of PXF:

1. Log in to the Greenplum Database coordinator host. For example:

    ``` shell
    $ ssh gpadmin@<coordinator>
    ```

1. Identify and note the version of PXF currently running in your Greenplum cluster:

    ``` shell
    gpadmin@coordinator$ pxf version
    ```


1. Identify the file system location of the `$PXF_CONF` setting in your PXF 5.x PXF installation; you will need this later. If you are unsure of the location, you can find the value in `pxf-env-default.sh`.

1. Stop PXF on each Greenplum host as described in [Stopping PXF](cfginitstart_pxf.html#stop_pxf).

## <a id="pxfinst"></a>Step 2: Installing PXF 6.x

1. Install PXF 6.x and identify and note the new PXF version number.

1. Check out the new installation layout in [About the PXF Installation and Configuration Directories](about_pxf_dir.html).

## <a id="pxfup"></a>Step 3: Completing the Upgrade to PXF 6.x

After you install the new version of PXF, perform the following procedure:

1. Log in to the Greenplum Database coordinator host. For example:

    ``` shell
    $ ssh gpadmin@<coordinator>
    ```

1. You must run the `pxf` commands specified in subsequent steps using the binaries from your PXF 6.x installation. Ensure that the PXF 6.x installation `bin/` directory is in your `$PATH`, or provide the full path to the `pxf` command. You can run the following command to check the `pxf` version:

    ``` shell
    gpadmin@coordinator$ pxf version
    ```

1. (Optional, Advanced) If you want to relocate `$PXF_BASE` outside of `<PXF_INSTALL_DIR>`, perform the procedure described in [Relocating $PXF_BASE](about_pxf_dir.html#movebase).

1. Auto-migrate your PXF 5.x configuration to PXF 6.x `$PXF_BASE`:

    1. Recall your PXF 5.x `$PXF_CONF` setting.
    2. Run the `migrate` command (see [pxf cluster migrate](ref/pxf-cluster.html)). You must provide `PXF_CONF`. If you relocated `$PXF_BASE`, provide that setting as well.

        ``` shell
        gpadmin@coordinator$ PXF_CONF=/path/to/dir pxf cluster migrate
        ```

        Or:

        ``` shell
        gpadmin@coordinator$ PXF_CONF=/path/to/dir PXF_BASE=/new/dir pxf cluster migrate
        ```

        The command copies PXF 5.x `conf/pxf-profiles.xml`, `servers/*`, `lib/*`, and `keytabs/*` to the PXF 6.x `$PXF_BASE` directory. The command also merges configuration changes in the PXF 5.x `conf/pxf-env.sh` into the PXF 6.x file of the same name and into `pxf-application.properties`.
    3. The `migrate` command does not migrate PXF 5.x `$PXF_CONF/conf/pxf-log4j.properties` customizations; you must manually migrate any changes that you made to this file to `$PXF_BASE/conf/pxf-log4j2.xml`.  Note that PXF 5.x `pxf-log4j.properties` is in properties format, and PXF 6 `pxf-log4j2.xml` is `xml` format. See the [Configuration with XML](https://logging.apache.org/log4j/2.x/manual/configuration.html#XML) topic in the Apache Log4j 2 documentation for more information.

1. *If you migrated your PXF 6.x `$PXF_BASE` configuration (see previous step), be sure to apply any changes identified in subsequent steps to the new, migrated directory*.

3. **If you are upgrading from PXF version 5.9.x or earlier** and you have configured any JDBC servers that access Kerberos-secured Hive, you must now set the `hadoop.security.authentication` property to the `jdbc-site.xml` file to explicitly identify use of the Kerberos authentication method. Perform the following for each of these server configs:

    1. Navigate to the server configuration directory.
    2. Open the `jdbc-site.xml` file in the editor of your choice and uncomment or add the following property block to the file:

        ```xml
        <property>
            <name>hadoop.security.authentication</name>
            <value>kerberos</value>
        </property>
        ```
    3. Save the file and exit the editor.

4. **If you are upgrading from PXF version 5.11.x or earlier**: The PXF `Hive` and `HiveRC` profiles (named `hive` and `hive:rc` in PXF version 6.x) now support column projection using column name-based mapping. If you have any existing PXF external tables that specify one of these profiles, and the external table relied on column index-based mapping, you may be required to drop and recreate the tables:

    1. Identify all PXF external tables that you created that specify a `Hive` or `HiveRC` profile.

    2. For *each* external table that you identify in step 1, examine the definitions of both the PXF external table and the referenced Hive table. If the column names of the PXF external table *do not* match the column names of the Hive table:

        1. Drop the existing PXF external table. For example:

            ``` sql
            DROP EXTERNAL TABLE pxf_hive_table1;
            ```

        2. Recreate the PXF external table using the Hive column names. For example:

            ``` sql
            CREATE EXTERNAL TABLE pxf_hive_table1( hivecolname int, hivecolname2 text )
              LOCATION( 'pxf://default.hive_table_name?PROFILE=hive')
            FORMAT 'custom' (FORMATTER='pxfwritable_import');
            ```

        3. Review any SQL scripts that you may have created that reference the PXF external table, and update column names if required.

4. **If you are upgrading from PXF version 5.15.x or earlier**:

    1. The `pxf.service.user.name` property in the `pxf-site.xml` template file is now commented out by default. Keep this in mind when you configure new PXF servers.
    2. The default value for the `jdbc.pool.property.maximumPoolSize` property is now `15`. If you have previously configured a JDBC server and want that server to use the new default value, you must manually change the property value in the server's `jdbc-site.xml` file.
    3. PXF 5.16 disallows specifying relative paths and environment variables in the `CREATE EXTERNAL TABLE` `LOCATION` clause file path. If you previously created any external tables that specified a relative path or environment variable, you must drop each external table, and then re-create it without these constructs.
    4. Filter pushdown is activated by default for queries on external tables that specify the `Hive`, `HiveRC`, or `HiveORC` profiles (named `hive`, `hive:rc`, and `hive:orc` in PXF version 6.x). If you have previously created an external table that specifies one of these profiles and queries are failing with PXF v5.16+, you can deactivate filter pushdown at the external table-level or at the server level:

        1. (External table) Drop the external table and re-create it, specifying the `&PPD=false` option in the `LOCATION` clause.
        2. (Server) If you do not want to recreate the external table, you can deactivate filter pushdown *for all* `Hive*` (named as described [here](access_hdfs.html#hadoop_connectors) in PXF version 6.x) *profile queries using the server* by setting the `pxf.ppd.hive` property in the `pxf-site.xml` file to `false`:

            ``` pre
            <property>
                <name>pxf.ppd.hive</name>
                <value>false</value>
            </property>
            ```

            You may need to add this property block to the `pxf-site.xml` file.

1. Register the PXF 6.x extension files with Greenplum Database (see [pxf cluster register](ref/pxf-cluster.html)). `$GPHOME` must be set when you run this command.

    ``` shell
    gpadmin@coordinator$ pxf cluster register
    ```

    The `register` command copies only the `pxf.control` extension file to the Greenplum cluster. In PXF 6.x, the PXF extension `.sql` file and library `pxf.so` reside in `<PXF_INSTALL_DIR>/gpextable`. You may choose to remove these now-unused files from the Greenplum Database installation *on the Greenplum Database coordinator host, the standby coordinator host, and all segment hosts*. For example, to remove the files on the coordinator host:

    ``` shell
    gpadmin@coordinator$ rm $GPHOME/share/postgresql/extension/pxf--1.0.sql
    gpadmin@coordinator$ rm $GPHOME/lib/postgresql/pxf.so
    ```

1. PXF 6.x includes a new version of the `pxf` extension. You must update the extension in every Greenplum database in which you are using PXF. A database superuser or the database owner must run this SQL command in the `psql` subsystem or in an SQL script:

    ``` sql
    ALTER EXTENSION pxf UPDATE;
    ```

1. Ensure that you no longer reference previously-deprecated features that were removed in PXF 6.0:

    | Deprecated Feature | Use Instead |
    | -------------------|-------------|
    | Hadoop profile names | `hdfs:<profile>` as noted [here](access_hdfs.html#hadoop_connectors) |
    | `jdbc.user.impersonation` property | `pxf.service.user.impersonation` property in the [jdbc&#8209;site.xml](jdbc_cfg.html#jdbcimpers) server configuration file |
    | `PXF_KEYTAB` configuration property | `pxf.service.kerberos.keytab` property in the [pxf&#8209;site.xml](cfg_server.html#pxf-site) server configuration file |
    | `PXF_PRINCIPAL` configuration property | `pxf.service.kerberos.principal` property in the [pxf&#8209;site.xml](cfg_server.html#pxf-site) server configuration file |
    | `PXF_USER_IMPERSONATION` configuration property | `pxf.service.user.impersonation` property in the [pxf&#8209;site.xml](cfg_server.html#pxf-site) server configuration file |

1. PXF 6.x distributes a single JAR file that includes all of its dependencies, and separately makes its HBase JAR file available in `<PXF_INSTALL_DIR>/share`. If you have configured a PXF Hadoop server for HBase access, you must register the new `pxf-hbase-<version>.jar` with Hadoop and HBase as follows:

    1. Copy `<PXF_INSTALL_DIR>/share/pxf-hbase-<version>.jar` to each node in your HBase cluster.
    1. Add the location of this JAR to `$HBASE_CLASSPATH` on each HBase node.
    1. Restart HBase on each node.

1. In PXF 6.x, the PXF Service runs on all Greenplum Database hosts. If you used PXF 5.x to access Kerberos-secured HDFS, you must now generate principals and keytabs for the Greenplum coordinator and standby coordinator hosts, and distribute these to the hosts as described in [Configuring PXF for Secure HDFS](pxf_kerbhdfs.html#procedure).

4. Synchronize the PXF 6.x configuration from the coordinator host to the standby coordinator host and each Greenplum Database segment host. For example:

    ``` shell
    gpadmin@coordinator$ pxf cluster sync
    ```
 
5. Start PXF on each Greenplum host. For example:

    ``` shell
    gpadmin@coordinator$ pxf cluster start
    ```

1. Verify that PXF can access each external data source by querying external tables that specify each PXF server configuration.

